Abstract

In this paper we study $\gamma$-structures filtered by topological genus. $\gamma$-structures are a
class of RNA pseudoknot structures that plays a key role in the context of polynomial time
folding of RNA pseudoknot structures. A $\gamma$-structure is composed of specific building blocks
that have topological genus less than or equal to $\gamma$, where composition means concatenation
and nesting of such blocks. Our main results are the derivation of a new bivariate generating
function for $\gamma$-structures via symbolic methods, the singularity analysis of the solutions and
a central limit theorem for the distribution of topological genus in $\gamma$-structures of given
length. In our derivation specific bivariate polynomials play a central role. Their coefficients
count particular motifs of fixed topological genus and they are of relevance in the context of
genus recursion and novel folding algorithms.

Keywords: $\gamma$-structure, Genus filtration, Irreducible shadow, Generating function,
Singularity analysis, Central limit theorem

2010 MSC: 05A16, 92E10



1. Introduction 



An RNA sequence is a linear, oriented sequence of the nucleotides (bases) A, U, G, C.
These sequences "fold" by establishing bonds between pairs of nucleotides. These bonds
cannot form arbitrarily: a nucleotide can establish at most one bond, and the global
conformation of an RNA molecule is determined by topological constraints encoded at the
level of secondary structure, i.e., by the mutual arrangements of the base pairs [1].
Secondary structures can be interpreted as (partial) matchings in a graph of permissible base
pairs [2]. When represented as a diagram, i.e. as a graph whose vertices are drawn on a
horizontal line with arcs in the upper half-plane, one refers to a secondary structure with
crossing arcs as a pseudoknot structure.

Folded configurations are energetically optimal. Here energy means free energy, which 
is dominated by the stacking of adjacent base pairs and not by the hydrogen bonds of the 
individual base pairs [3], as well as minimum arc-length conditions jl]. That is, a stack is 
tantamount to a sequence of parallel arcs $((i,j),(i+1,j-1),\dots,(i+r-1,j-r+1))$. In

particular, only configurations without isolated bonds and without bonds of length one 
(formed by immediately subsequent nucleotides) are observed in RNA structures. For 
a given RNA sequence polynomial-time dynamic programming (DP) algorithms can be 
devised, finding such minimal energy configurations. 

The topological classification of RNA structures [5, 6] has recently been translated into
an efficient dynamic programming algorithm [7]. This algorithm a priori folds into a novel
class of pseudoknot structures, the $\gamma$-structures. $\gamma$-structures differ from pseudoknotted
RNA structures of fixed topological genus of an associated fatgraph or double line graph [8]
and [9], since they do not have a fixed genus. They are composed of irreducible subdiagrams
whose individual genus is bounded by $\gamma$ and contain no bonds of length one (1-arcs), see
Section 2.1 for details. The folding of $\gamma$-structures has led to unprecedented sensitivity
and positive predictive value [7].

Fig. 1. The central limit distribution: we display the distribution of topological
genera of canonical 1-structures for n = 100.

In this paper we study $\gamma$-structures filtered by topological genus, i.e. partial matchings
composed of irreducible motifs of genus $\le \gamma$, without 1-arcs. These motifs are called
irreducible shadows and are discussed in detail in Section 2.1. Considering the topological
filtration of $\gamma$-structures is tantamount to constructing a new bivariate generating function.
This construction recruits specific bivariate generating polynomials that are associated with
irreducible shadows of genus $\le \gamma$, which we denote by $\mathbf{Is}_\gamma(z,t)$. For example, for
$\gamma = 1, 2$ we have

$$\mathbf{Is}_1(z,t) = (1+z)^2 z^2\, t,$$

$$\mathbf{Is}_2(z,t) = (1+z)^4 z^4 (24z+17)(4z+1)\, t^2 + (1+z)^2 z^2\, t.$$

The bivariate algebraic equations for $\gamma$-structures discovered here are instrumental for
obtaining recursions for computing shadows of genus g from those of smaller genera.
Similar to the Zagier-Harer generating function [H], it is a fascinating prospect to derive a
recursion for the polynomials $\mathbf{Is}_g(z,t)$. Here it will be vital to obtain hints for bijective
proofs hidden in the algebraic formulas. Common factors of these polynomials, whose
coefficients count numbers of irreducible shadows of fixed genus, will be the key for deeper
understanding. Results along these lines will have profound algorithmic impact and offer
novel insights into how to fold topological $\gamma$-structures faster.

We then study topological $\gamma$-structures from a probabilistic point of view. Regarding
the bivariate generating functions as parameterized univariate functions, we can prove a
central limit theorem for the distribution of topological genera in $\gamma$-structures of fixed
length n. We find that the expected genus of a canonical 1-structure, i.e. a structure that
does not contain any isolated arcs, is given by 0.04123 n with a variance of 0.0093 n.
Thus the expected genus is linear in n and in particular a canonical 1-structure of length
100 has an expected genus of 4, see Fig. 1.
Fig. 2. Shadows: the shadow is obtained by removing all noncrossing arcs and
isolated points and collapsing all stacks and resulting stacks into single arcs. 



2. Background 

2.1. $\gamma$-diagrams. A diagram is a labeled graph over the vertex set $[n] = \{1,\dots,n\}$ in
which each vertex has degree $\le 3$, represented by drawing its vertices in a horizontal
line and its arcs $(i,j)$, where $i < j$, in the upper half-plane. The backbone
of a diagram is the sequence of consecutive integers $(1,\dots,n)$ together with the edges
$\{\{i,i+1\} \mid 1 \le i \le n-1\}$. We shall distinguish the backbone edge $\{i,i+1\}$ from the
arc $(i,i+1)$, which we refer to as a 1-arc. A stack of length $r$ is a maximal sequence of
"parallel" arcs, $((i,j),(i+1,j-1),\dots,(i+r-1,j-r+1))$. A stack of length $\ge \tau$ is
called a $\tau$-canonical stack. In particular, a stack of length one is an isolated arc.

The specific drawing of a diagram G in the plane determines a cyclic ordering on the
half-edges of the underlying graph incident on each vertex, thus defining a corresponding
fatgraph $\mathbb{G}$. The collection of cyclic orderings, one such ordering on the half-edges
incident on each vertex, is called a fattening. Each fatgraph $\mathbb{G}$ determines an oriented
surface $F(\mathbb{G})$ [10], [11], which is connected if G is and has some associated genus
$g(\mathbb{G}) \ge 0$ and number $r(\mathbb{G}) \ge 1$ of boundary components. Clearly, $F(\mathbb{G})$ contains G as a
deformation retract [12]. Fatgraphs were first applied to RNA secondary structures in [13, 14].

A diagram G hence determines a unique surface $F(\mathbb{G})$ (with boundary). Filling the
boundary components with discs, we can pass from $F(\mathbb{G})$ to a surface without boundary.
The Euler characteristic, $\chi$, and genus, $g$, of this surface are given by

$$\chi = v - e + r \quad\text{and}\quad g = 1 - \tfrac{1}{2}\chi,$$

respectively, where $v$, $e$, $r$ are the numbers of discs, ribbons and boundary components in
$\mathbb{G}$ [12]. The genus of a diagram is that of its associated surface without boundary.
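
The genus formula can be tried out on small examples. The following is a minimal sketch, assuming the diagram is a perfect matching on the vertices $1,\dots,2n$, so that the backbone collapses to a single disc ($v = 1$) and each arc is one ribbon ($e = n$). Boundary components are then counted as the cycles of the map "cross the arc at $x$, then step to the next backbone vertex" on the cyclically closed backbone. The function name `genus` and the arc-list encoding are illustrative choices, not notation from the text.

```python
def genus(arcs):
    """Genus of a matching given as a list of arcs (i, j) on vertices 1..2n.

    Assumption: every vertex is paired exactly once (a perfect matching).
    """
    n = len(arcs)
    partner = {}
    for i, j in arcs:
        partner[i - 1] = j - 1  # switch to 0-based labels
        partner[j - 1] = i - 1
    assert sorted(partner) == list(range(2 * n)), "expects a perfect matching on 1..2n"

    # Trace boundary components: cross the arc at x, then move to the next vertex.
    seen, r = set(), 0
    for start in range(2 * n):
        if start in seen:
            continue
        r += 1
        x = start
        while x not in seen:
            seen.add(x)
            x = (partner[x] + 1) % (2 * n)

    chi = 1 - n + r        # v - e + r with v = 1 disc and e = n ribbons
    return (2 - chi) // 2  # g = 1 - chi / 2


print(genus([(1, 4), (2, 3)]))                  # two nested, noncrossing arcs
print(genus([(1, 3), (2, 4)]))                  # one pair of crossing arcs
print(genus([(1, 3), (2, 4), (5, 7), (6, 8)]))  # two crossing pairs, concatenated
```

Under these assumptions the three calls give genus 0, 1 and 2; the last one shows the genus adding up over concatenated crossing pairs.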

A shadow is a diagram without noncrossing arcs, isolated vertices and stacks of length
greater than one. The shadow of a diagram of genus g is obtained by removing all
noncrossing arcs, deleting all isolated vertices and collapsing all induced stacks to single
arcs, see Fig. 2.

The shadow of a diagram G, $\sigma(G)$, can possibly be empty. Furthermore, projecting
into the shadow does not affect genus. Any shadow of genus g over one backbone contains
at least $2g$ and at most $(6g-2)$ arcs. In particular, for fixed genus g, there exist only
finitely many shadows [15].
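
The projection onto the shadow can be sketched in a few lines. This is an illustrative implementation under the assumption that a diagram is given only by its list of arcs $(i,j)$ with $i<j$ (isolated vertices are simply positions not covered by any arc); the helper names `crossing` and `shadow` are hypothetical.

```python
def crossing(a, b):
    """True if arcs a = (i, j) and b = (k, l) cross."""
    (i, j), (k, l) = a, b
    return i < k < j < l or k < i < l < j


def shadow(arcs):
    """Shadow of a diagram: drop noncrossing arcs and isolated vertices,
    then collapse stacks to single arcs (relabelling endpoints 1..2m)."""
    # Step 1: keep only arcs that cross at least one other arc.
    arcs = [a for a in arcs if any(crossing(a, b) for b in arcs)]
    changed = True
    while changed:
        # Step 2: relabel endpoints consecutively, which deletes isolated vertices.
        pts = sorted(p for a in arcs for p in a)
        rank = {p: r for r, p in enumerate(pts, 1)}
        arcs = sorted((rank[i], rank[j]) for i, j in arcs)
        # Step 3: collapse one pair of parallel arcs (i, j), (i+1, j-1) and repeat.
        present = set(arcs)
        changed = False
        for i, j in arcs:
            if (i + 1, j - 1) in present:
                arcs.remove((i + 1, j - 1))
                changed = True
                break
    return arcs


# Two stacked arcs crossing two stacked arcs, plus a noncrossing arc (10, 11)
# and an isolated vertex 3: the shadow is a single pair of crossing arcs.
print(shadow([(1, 7), (2, 6), (4, 9), (5, 8), (10, 11)]))
```

A diagram without any crossing, such as `[(1, 2)]`, is mapped to the empty list, matching the remark that the shadow can be empty.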

A diagram is called irreducible if and only if for any two arcs $\alpha_1, \alpha_k$ contained in
E, there exists a sequence of arcs $(\alpha_1, \alpha_2, \dots, \alpha_{k-1}, \alpha_k)$ such that $(\alpha_i, \alpha_{i+1})$ are
crossing. Irreducibility is equivalent to the concept of primitivity introduced by [5].
According to [15], for arbitrary genus g and $2g \le \ell \le (6g-2)$, there exists an irreducible
shadow of genus g having exactly $\ell$ arcs.

Fig. 3. The shadow $\sigma(G)$ of a diagram G decomposes into a set of irreducible
shadows, which implies that G is a 2-diagram.

Fig. 4. $\sigma$-intervals and P-intervals.

The shadow $\sigma(G)$ of a diagram G decomposes into a set of irreducible shadows. We
shall call these shadows irreducible G-shadows. A diagram, G, is a $\gamma$-diagram if and only
if for any irreducible G-shadow, $G'$, $g(G') \le \gamma$ holds, see Fig. 3.

We call $\tau$-canonical $\gamma$-diagrams without arcs of the form $(i,i+1)$ (1-arcs) $\tau$-canonical
$\gamma$-structures and their set is denoted by $\mathcal{G}_{\tau,\gamma}$. The set of $\gamma$-diagrams that contain only
vertices of degree three ($\gamma$-matchings) is denoted by $\mathcal{H}_\gamma$ and the set of $\gamma$-matchings that
contain only stacks of length one ($\gamma$-shapes) is denoted by $\mathcal{S}_\gamma$.

A stack of length $r$, $((i,j),(i+1,j-1),\dots,(i+r-1,j-r+1))$, induces a sequence of
pairs $(([i,i+1],[j-1,j]),([i+1,i+2],[j-2,j-1]),\dots)$. We call any of these $2(r-1)$
intervals a P-interval. The interval $[i+r-1,j-r+1]$ is called a $\sigma$-interval, see Fig. 4.
We distinguish these two types of interval for special manipulation in the inflation step.

2.2. Some generating functions. Let $i(g,n)$ denote the number of irreducible shadows
of genus g with n arcs. Since for fixed genus g there exist only finitely many shadows, we
have the generating polynomial of irreducible shadows of genus g

$$\mathbf{I}_g(z) = \sum_{n=2g}^{6g-2} i(g,n)\, z^n.$$

For instance, for genus 1 and 2 we have

$$\mathbf{I}_1(z) = z^2 + 2z^3 + z^4,$$

$$\mathbf{I}_2(z) = 17z^4 + 160z^5 + 566z^6 + 1004z^7 + 961z^8 + 476z^9 + 96z^{10}.$$

We denote the bivariate generating polynomial of irreducible shadows of genus $\le \gamma$ by

$$\mathbf{Is}_\gamma(z,t) = \sum_{g \le \gamma} \mathbf{I}_g(z)\, t^g.$$

For example, for $\gamma = 1$ and $\gamma = 2$ we have

$$\mathbf{Is}_1(z,t) = (1+z)^2 z^2\, t,$$

$$\mathbf{Is}_2(z,t) = (1+z)^4 z^4 (24z+17)(4z+1)\, t^2 + (1+z)^2 z^2\, t.$$
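
The factored forms of the shadow polynomials can be checked against the expanded polynomials $\mathbf{I}_1(z)$ and $\mathbf{I}_2(z)$ by plain polynomial multiplication. The sketch below stores a polynomial as a list of coefficients indexed by the power of $z$; the helpers `poly_mul` and `poly_pow` are illustrative names.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = power of z)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_pow(p, k):
    """k-th power of a polynomial by repeated multiplication."""
    out = [1]
    for _ in range(k):
        out = poly_mul(out, p)
    return out


one_plus_z = [1, 1]

# I_1(z) = z^2 (1 + z)^2
I1 = [0, 0] + poly_pow(one_plus_z, 2)

# I_2(z) = z^4 (1 + z)^4 (24 z + 17) (4 z + 1)
I2 = [0] * 4 + poly_mul(poly_pow(one_plus_z, 4), poly_mul([17, 24], [1, 4]))

print(I1)  # coefficients of z^0 .. z^4
print(I2)  # coefficients of z^0 .. z^10
```

The nonzero coefficients of `I2` sit in degrees 4 through 10, which is the range $2g \le n \le 6g-2$ for $g = 2$.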

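Let $h_\gamma(g,n)$ denote the number of $\gamma$-matchings of genus g with n arcs. The univariate
and bivariate generating functions of $\gamma$-matchings are given by

$$\mathbf{H}_\gamma(z) = \sum_{n}\sum_{g} h_\gamma(g,n)\, z^n, \qquad \mathbf{H}_\gamma(z,t) = \sum_{g,n} h_\gamma(g,n)\, z^n t^g.$$

Let $s_\gamma(g,n,m)$ denote the number of $\gamma$-shapes of genus g with n arcs and m 1-arcs, with
generating function

$$\mathbf{S}_\gamma(z,t,e) = \sum_{g,n,m} s_\gamma(g,n,m)\, t^g z^n e^m.$$

Finally, $\mathbf{G}_{\tau,\gamma}(g,n)$ denotes the number of $\tau$-canonical $\gamma$-structures of genus g with n
vertices, with generating function

$$\mathbf{G}_{\tau,\gamma}(z,t) = \sum_{g,n} \mathbf{G}_{\tau,\gamma}(g,n)\, t^g z^n.$$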

2.3. A central limit theorem. We next discuss a central limit theorem due to Bender
[16]. It is proved by analyzing the characteristic function using the Levy-Cramer Theorem
(Theorem IX.4 in [17]).

Theorem 1. Suppose we are given the bivariate generating function

$$f(z,u) = \sum_{n,t \ge 0} f(n,t)\, z^n u^t,$$

where $f(n,t) \ge 0$ and $f(n) = \sum_t f(n,t)$. Let $X_n$ be a r.v. such that
$P(X_n = t) = f(n,t)/f(n)$. Suppose

$$[z^n] f(z,e^s) = c(s)\, n^{\alpha}\, \gamma(s)^{-n} \left(1 + O\!\left(\tfrac{1}{n}\right)\right)$$

uniformly in s in a neighborhood of 0, where $c(s)$ is continuous and nonzero near 0, $\alpha$
is a constant, and $\gamma(s)$ is analytic near 0. Then there exists a pair $(\mu, \sigma)$ such that the
normalized random variable

$$X_n^* = \frac{X_n - \mu n}{\sqrt{n \sigma^2}}$$

converges in distribution to a Gaussian variable with a speed of convergence $O(n^{-1/2})$. That
is, we have

$$\lim_{n \to \infty} P(X_n^* < x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{1}{2}c^2}\, dc,$$

where $\mu$ and $\sigma^2$ are given by

$$\mu = -\frac{\gamma'(0)}{\gamma(0)} \quad\text{and}\quad \sigma^2 = \left(\frac{\gamma'(0)}{\gamma(0)}\right)^2 - \frac{\gamma''(0)}{\gamma(0)}.$$
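
The formulas for $\mu$ and $\sigma^2$ can be exercised on a toy case that is not taken from this paper: for $f(n,t) = \binom{n}{t}$ one has $[z^n] f(z,e^s) = (1+e^s)^n$, i.e. $\gamma(s) = 1/(1+e^s)$, and the theorem should return the mean $n/2$ and variance $n/4$ of a fair-coin count. The sketch below approximates the derivatives by central finite differences; the function name `bender_parameters` and the step size `h` are assumptions made for illustration.

```python
import math


def bender_parameters(gamma, h=1e-4):
    """Approximate (mu, sigma^2) of Theorem 1 from the function gamma(s).

    Derivatives at s = 0 are estimated by central finite differences.
    """
    g0 = gamma(0.0)
    d1 = (gamma(h) - gamma(-h)) / (2 * h)            # gamma'(0)
    d2 = (gamma(h) - 2 * g0 + gamma(-h)) / (h * h)   # gamma''(0)
    mu = -d1 / g0
    sigma2 = (d1 / g0) ** 2 - d2 / g0
    return mu, sigma2


def coin_gamma(s):
    """Toy example: [z^n] f(z, e^s) = (1 + e^s)^n = coin_gamma(s)^(-n)."""
    return 1.0 / (1.0 + math.exp(s))


mu, sigma2 = bender_parameters(coin_gamma)
print(mu, sigma2)  # close to 0.5 and 0.25
```

For the $\gamma$-structures studied here, $\gamma(s)$ is replaced by the dominant singularity as a function of $s$, which is only known implicitly.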




Fig. 5. A 2-matching G containing the maximal arcs (1,6), (7,19), (17,20),
(21,26), (23,28), five components and the three blocks G[1,6], G[7,20], G[21,28].



3. $\gamma$-matchings

3.1. $\gamma$-matchings. Given a matching G, an arc is called maximal if it is maximal with
respect to the partial order

$$(i,j) \le (i',j') \iff i' \le i \,\wedge\, j \le j'.$$

The arc set of a matching G gives rise to a (combinatorial) graph $\varphi(G)$, obtained by
mapping each labeled arc $\alpha$ into the vertex $\varphi(\alpha) = v_\alpha$ and connecting any two such
vertices iff the corresponding arcs are crossing in G, $\varphi\colon G \to \varphi(G)$. A component of a
matching G is a set of arcs A such that $\varphi(A)$ is a component in $\varphi(G)$. Considering the left-
and rightmost endpoints of a component containing some maximal arc induces a partition of
the backbone into subsequent intervals. G induces over each such interval a sub-matching,
which we refer to as a block. By construction, all maximal arcs of a fixed block are
contained in a unique component, see Fig. 5.
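
The notions of maximal arc, component and block can be made concrete as follows. This is a minimal sketch, assuming a matching is given as a list of arcs $(i,j)$ with $i<j$; the example matching and the helper names are hypothetical and do not reproduce the matching of Fig. 5.

```python
def crossing(a, b):
    """True if arcs a = (i, j) and b = (k, l) cross."""
    (i, j), (k, l) = a, b
    return i < k < j < l or k < i < l < j


def components(arcs):
    """Connected components of the crossing graph, via a small union-find."""
    parent = {a: a for a in arcs}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for x in arcs:
        for y in arcs:
            if crossing(x, y):
                parent[find(x)] = find(y)
    groups = {}
    for a in arcs:
        groups.setdefault(find(a), []).append(a)
    return sorted(sorted(g) for g in groups.values())


def maximal_arcs(arcs):
    """Arcs that are not strictly nested inside another arc."""
    return sorted(a for a in arcs
                  if not any(k < a[0] and a[1] < l for k, l in arcs))


def blocks(arcs):
    """Backbone intervals spanned by the components that contain a maximal arc."""
    top = set(maximal_arcs(arcs))
    spans = {(min(i for i, _ in c), max(j for _, j in c))
             for c in components(arcs) if top & set(c)}
    return sorted(spans)


G = [(1, 6), (2, 3), (4, 5), (7, 10), (8, 12), (9, 11)]
print(maximal_arcs(G))  # (1, 6), (7, 10), (8, 12)
print(components(G))    # four components
print(blocks(G))        # two blocks: [1, 6] and [7, 12]
```

In this example the maximal arcs (7, 10) and (8, 12) of the second block lie in one component, in line with the observation that all maximal arcs of a block belong to a unique component.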

Any $\gamma$-matching can be decomposed by iteratively removing components from top to
bottom as follows:

• one decomposes a $\gamma$-matching into a sequence of blocks,

• for each block, one removes the unique component containing all its maximal arcs.

Each component can be viewed as a matching by considering it over a backbone. In this
context any such component has genus $\le \gamma$. By construction the shadow of a component
is always irreducible, see Fig. 6.

Accordingly, any diagram G can iteratively be decomposed by first removing all isolated
vertices and second by removing components iteratively according to the above procedure.

The genus of a $\gamma$-matching is additive in the context of the above decomposition.

Proposition 1. Suppose a matching G decomposes into a series of components $G_1, \dots, G_n$.
Then

$$g(G) = g(G_1) + \cdots + g(G_n).$$

Proof. It suffices to prove the case of a matching G generated by concatenating or nesting
two components $G_1$ and $G_2$.

Let $n$, $n_1$ and $n_2$ denote the number of arcs in $G$, $G_1$ and $G_2$, respectively, and let $r$, $r_1$
and $r_2$ denote the number of boundary components in $G$, $G_1$ and $G_2$, respectively. We have

$$2g(G) = 1 + n - r, \qquad 2g(G_1) = 1 + n_1 - r_1, \qquad 2g(G_2) = 1 + n_2 - r_2. \tag{3.1}$$

Observe that the following relations hold:

$$n = n_1 + n_2, \qquad r = r_1 + r_2 - 1. \tag{3.2}$$

Combining equations (3.1) and (3.2), we have

$$g(G) = g(G_1) + g(G_2),$$

completing the proof. □

Fig. 6. Decomposition of a 2-matching by iteratively removing components from
top to bottom.

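3.2. A functional equation. In [18], the generating function of $\gamma$-matchings has been
computed. In the following we shall refine this generating function by its inherent genus
filtration.

Theorem 2. Let $R = \mathbb{Z}[z,t]$. Then the following assertions hold:

(a) the bivariate generating function of $\gamma$-matchings, $\mathbf{H}_\gamma(z,t)$, satisfies

$$\mathbf{H}_\gamma(z,t)^{-1} = 1 - \left( z\,\mathbf{H}_\gamma(z,t) + \mathbf{H}_\gamma(z,t)^{-1}\, \mathbf{Is}_\gamma\!\left(\frac{z\,\mathbf{H}_\gamma(z,t)^2}{1 - z\,\mathbf{H}_\gamma(z,t)^2},\, t\right) \right), \tag{3.3}$$

or equivalently,

$$\mathbf{H}_\gamma(z,t) - z\,\mathbf{H}_\gamma(z,t)^2 - \mathbf{Is}_\gamma\!\left(\frac{z\,\mathbf{H}_\gamma(z,t)^2}{1 - z\,\mathbf{H}_\gamma(z,t)^2},\, t\right) = 1. \tag{3.4}$$

In particular, there exists a polynomial $P_\gamma(z,t,X) \in R[X]$ of degree $(12\gamma - 2)$, such that
$P_\gamma(z,t,\mathbf{H}_\gamma(z,t)) = 0$.

(b) eq. (3.4) determines $\mathbf{H}_\gamma(z,t)$ uniquely.

Fig. 7. Step I: inflation of each arc in $\sigma$ into a sequence of induced arcs.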



Proof. We divide the blocks into two categories, characterized by the unique
component containing all maximal arcs (maximal component). Namely,

• blocks whose maximal component contains only one arc,

• blocks whose maximal component is a (nonempty) irreducible matching.

In the first case, the removal of the maximal component (one arc) generates again a
$\gamma$-matching, which translates into the term

$$z\,\mathbf{H}_\gamma(z,t).$$

Let $\mathbf{T}(z,t)$ denote the (genus filtered) generating function of blocks of the second type.
The decomposition of $\gamma$-matchings into a sequence of blocks implies

$$\mathbf{H}_\gamma(z,t)^{-1} = 1 - \left(z\,\mathbf{H}_\gamma(z,t) + \mathbf{T}(z,t)\right).$$

Let $\sigma$ be a fixed irreducible shadow of genus g having n arcs. Let $\mathbf{T}_\sigma(z,t)$ be the generating
function of blocks having $\sigma$ as the shadow of their unique maximal component. Then we
have

$$\mathbf{T}(z,t) = \sum_{\sigma \in \mathcal{I}_\gamma} \mathbf{T}_\sigma(z,t),$$

where $\mathcal{I}_\gamma$ denotes the set of irreducible shadows of genus $\le \gamma$. We shall construct $\mathbf{T}_\sigma$ in
three steps using arcs, $\mathcal{R}$, sequences of arcs, $\mathcal{K}$, induced arcs, $\mathcal{N}$, sequences of induced arcs,
$\mathcal{M}$, and arbitrary $\gamma$-matchings, $\mathcal{H}$.

Step I: We inflate each arc in $\sigma$ into a sequence of induced arcs, see Fig. 7. An induced
arc is an arc together with at least one nontrivial $\gamma$-matching in either one or in both
P-intervals:

$$\mathcal{N} = \mathcal{R} \times \left((\mathcal{H} - 1) + (\mathcal{H} - 1) + (\mathcal{H} - 1)^2\right) = \mathcal{R} \times \left(\mathcal{H}^2 - 1\right).$$

Clearly, we have for a single induced arc $\mathbf{N}(z,t) = z\,(\mathbf{H}_\gamma(z,t)^2 - 1)$, guaranteed by
Proposition 1, and for a sequence of induced arcs, $\mathcal{M} = \mathrm{Seq}(\mathcal{N})$, where

$$\mathbf{M}(z,t) = \frac{1}{1 - z\,(\mathbf{H}_\gamma(z,t)^2 - 1)}.$$

Inflating each arc into a sequence of induced arcs, $\mathcal{R}^n \times \mathcal{M}^n$, gives the corresponding
generating function

$$z^n\, \mathbf{M}(z,t)^n = \left(\frac{z}{1 - z\,(\mathbf{H}_\gamma(z,t)^2 - 1)}\right)^{n},$$

since, by Proposition 1, the genus is additive.
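Fig. 8. Step II: inflation of each arc in the component with shadow $\sigma$ into stacks.

Fig. 9. Step III: insertion of additional $\gamma$-matchings at exactly $(2n-1)$ $\sigma$-intervals.

Step II: We inflate each arc in the component with shadow $\sigma$ into stacks, see Fig. 8.
Writing $a = z/(1 - z\,(\mathbf{H}_\gamma(z,t)^2 - 1))$ for the term obtained in Step I, the corresponding
generating function is

$$\left(\frac{a}{1 - a}\right)^{n} = \left(\frac{z}{1 - z\,\mathbf{H}_\gamma(z,t)^2}\right)^{n}.$$

Step III: We insert additional $\gamma$-matchings at exactly $(2n-1)$ $\sigma$-intervals, see Fig. 9.
Accordingly, the generating function is $\mathbf{H}_\gamma(z,t)^{2n-1}$.

Combining these three steps and utilizing additivity of the genus, we arrive at

$$\mathbf{T}_\sigma(z,t) = t^g \left(\frac{z}{1 - z\,\mathbf{H}_\gamma(z,t)^2}\right)^{n} \mathbf{H}_\gamma(z,t)^{2n-1}
= t^g\, \mathbf{H}_\gamma(z,t)^{-1} \left(\frac{z\,\mathbf{H}_\gamma(z,t)^2}{1 - z\,\mathbf{H}_\gamma(z,t)^2}\right)^{n}.$$

Therefore

$$\mathbf{T}(z,t) = \sum_{\sigma \in \mathcal{I}_\gamma} \mathbf{T}_\sigma(z,t)
= \mathbf{H}_\gamma(z,t)^{-1}\, \mathbf{Is}_\gamma\!\left(\frac{z\,\mathbf{H}_\gamma(z,t)^2}{1 - z\,\mathbf{H}_\gamma(z,t)^2},\, t\right),$$

completing the proof of eq. (3.3).

Note that the $\mathbf{Is}_\gamma(z,t)$ are polynomials in z of degree $6\gamma - 2$. Eq. (3.4) gives rise to the
polynomial

$$P_\gamma(z,t,X) = (1 - zX^2)^{6\gamma-2}\,(-1 + X - zX^2) - (1 - zX^2)^{6\gamma-2}\, \mathbf{Is}_\gamma\!\left(\frac{zX^2}{1 - zX^2},\, t\right),$$

where $\deg(P_\gamma(z,t,X)) = 12\gamma - 2$. Then $P_\gamma(z,t,\mathbf{H}_\gamma(z,t)) = 0$, whence (a).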
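It remains to prove (b). Eq. (3.4) implies

$$(1 - z\,\mathbf{H}_\gamma(z,t)^2)^{6\gamma-2}\left(-1 + \mathbf{H}_\gamma(z,t) - z\,\mathbf{H}_\gamma(z,t)^2\right)
- (1 - z\,\mathbf{H}_\gamma(z,t)^2)^{6\gamma-2}\, \mathbf{Is}_\gamma\!\left(\frac{z\,\mathbf{H}_\gamma(z,t)^2}{1 - z\,\mathbf{H}_\gamma(z,t)^2},\, t\right) = 0$$

and consequently

$$\begin{aligned}
\mathbf{H}_\gamma(z,t) = {} & -\mathbf{H}_\gamma(z,t) \sum_{i=1}^{6\gamma-2} \binom{6\gamma-2}{i} \left(-z\,\mathbf{H}_\gamma(z,t)^2\right)^{i} \\
& + \left(1 + z\,\mathbf{H}_\gamma(z,t)^2\right)\left(1 - z\,\mathbf{H}_\gamma(z,t)^2\right)^{6\gamma-2} \\
& + \left(1 - z\,\mathbf{H}_\gamma(z,t)^2\right)^{6\gamma-2}\, \mathbf{Is}_\gamma\!\left(\frac{z\,\mathbf{H}_\gamma(z,t)^2}{1 - z\,\mathbf{H}_\gamma(z,t)^2},\, t\right).
\end{aligned} \tag{3.5}$$

All coefficients of $\mathbf{H}_\gamma(z,t)$ in the RHS of eq. (3.5) are polynomials in z of degree $\ge 1$,
whence any $[z^n t^g]\mathbf{H}_\gamma(z,t)$ for $n > (6\gamma - 1)$ can be recursively computed. Accordingly,
eq. (3.5) determines $\mathbf{H}_\gamma(z,t)$ uniquely. □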
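
The recursive computability of the coefficients can be illustrated for $\gamma = 1$. The sketch below does not follow eq. (3.5) literally; it iterates the equivalent fixed-point form of eq. (3.4), $\mathbf{H} = 1 + z\mathbf{H}^2 + \mathbf{Is}_1(y,t)$ with $y = z\mathbf{H}^2/(1 - z\mathbf{H}^2)$ and $\mathbf{Is}_1(y,t) = t\,y^2(1+y)^2$, on power series truncated at $z^N$. A series is stored as a dictionary mapping `(n, g)` to the coefficient of $z^n t^g$; this encoding and all function names are illustrative assumptions.

```python
from collections import defaultdict

ONE = {(0, 0): 1}


def mul(A, B, N):
    """Product of two bivariate series, truncated at z-degree N."""
    C = defaultdict(int)
    for (n1, g1), a in A.items():
        for (n2, g2), b in B.items():
            if n1 + n2 <= N:
                C[(n1 + n2, g1 + g2)] += a * b
    return dict(C)


def add(A, B):
    """Sum of two bivariate series."""
    C = defaultdict(int, A)
    for key, b in B.items():
        C[key] += b
    return dict(C)


def one_matchings(N):
    """Coefficients [z^n t^g] H_1(z, t) for n <= N by fixed-point iteration."""
    H = dict(ONE)
    for _ in range(N):  # each pass fixes at least one further order in z
        w = mul({(1, 0): 1}, mul(H, H, N), N)     # w = z H^2
        y, power = {}, dict(ONE)
        for _ in range(N):                        # y = w / (1 - w) = w + w^2 + ...
            power = mul(power, w, N)
            y = add(y, power)
        q = mul(y, add(ONE, y), N)                # y (1 + y)
        is1 = mul({(0, 1): 1}, mul(q, q, N), N)   # Is_1(y, t) = t y^2 (1 + y)^2
        H = add(add(ONE, w), is1)                 # H = 1 + z H^2 + Is_1(y, t)
    return H


H = one_matchings(4)
for n in range(5):
    print(n, {g: c for (m, g), c in sorted(H.items()) if m == n})
```

For $n = 4$ arcs this yields 14 matchings of genus 0, 70 of genus 1 and 4 of genus 2. The latter are the genus-2 matchings built from two crossing pairs by concatenation or nesting, while the 17 irreducible genus-2 shadows with four arcs are excluded for $\gamma = 1$.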

Remark: Proposition 1 makes the additional variable marking the genus compatible
with the inflation procedure in Theorem 2.

In particular, for $\gamma = 1$ and $\gamma = 2$ we have

$$\begin{aligned}
P_1(z,t,X) = {} & -1 + X + 3X^2 z - 4X^3 z - 2X^4 z^2 - X^4 t z^2 + 6X^5 z^2 \\
& - 2X^6 z^3 - 4X^7 z^3 + 3X^8 z^4 + X^9 z^4 - X^{10} z^5,
\end{aligned}$$

$$\begin{aligned}
P_2(z,t,X) = {} & -1 + X + 9X^2 z - 10X^3 z - 35X^4 z^2 - X^4 t z^2 + 45X^5 z^2 \\
& + 75X^6 z^3 + 6X^6 t z^3 - 120X^7 z^3 - 90X^8 z^4 - 15X^8 t z^4 - 17X^8 t^2 z^4 \\
& + 210X^9 z^4 + 42X^{10} z^5 + 20X^{10} t z^5 - 58X^{10} t^2 z^5 - 252X^{11} z^5 \\
& + 42X^{12} z^6 - 15X^{12} t z^6 - 21X^{12} t^2 z^6 + 210X^{13} z^6 - 90X^{14} z^7 \\
& + 6X^{14} t z^7 - 120X^{15} z^7 + 75X^{16} z^8 - X^{16} t z^8 + 45X^{17} z^8 \\
& - 35X^{18} z^9 - 10X^{19} z^9 + 9X^{20} z^{10} + X^{21} z^{10} - X^{22} z^{11}.
\end{aligned}$$
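
As a worked check of these expansions, the general formula for $P_\gamma$ can be evaluated in closed form. The shorthand $w = zX^2$ is introduced here for convenience and is not notation from the text. Since $1 + \frac{w}{1-w} = \frac{1}{1-w}$, substituting $y = \frac{w}{1-w}$ into $\mathbf{Is}_1$ and $\mathbf{Is}_2$ gives

$$(1-w)^{4}\, \mathbf{Is}_1\!\left(\tfrac{w}{1-w}, t\right) = t\, w^2,
\qquad
(1-w)^{10}\, \mathbf{Is}_2\!\left(\tfrac{w}{1-w}, t\right) = t\, w^2 (1-w)^6 + t^2 w^4 \left(17 + 58 w + 21 w^2\right),$$

where the last factor comes from $96w^2 + 92w(1-w) + 17(1-w)^2 = 21w^2 + 58w + 17$. Hence

$$P_1(z,t,X) = (1-w)^4(-1 + X - w) - t\, w^2,$$

$$P_2(z,t,X) = (1-w)^{10}(-1 + X - w) - t\, w^2 (1-w)^6 - t^2 w^4 \left(17 + 58 w + 21 w^2\right).$$

Expanding the binomials reproduces the coefficients listed above, for instance the terms $-17X^8t^2z^4$, $-58X^{10}t^2z^5$ and $-21X^{12}t^2z^6$ of $P_2$.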

3.3. Singularity analysis. The bivariate generating function $\mathbf{H}_\gamma(z,t)$ is not explicitly
known but is completely characterized by the functional equation established in
Theorem 2.

In the following we employ this implicit equation to obtain key information about the
singular expansion of $\mathbf{H}_\gamma(z,t)$, where we consider the latter as a univariate generating
function parameterized by t.

Theorem 3. [17] Let $F(z,t)$ be a bivariate function that is analytic at (0,0) and has
non-negative coefficients. Assume that $F(z,t)$ is one of the solutions y of a polynomial
equation

$$\Phi(z,t,y) = 0,$$

where $\Phi$ is a polynomial in y, such that $\Phi(z,1,y)$ satisfies the conditions of Theorem 9.
Define the resultant polynomial

$$\Delta(z,t) = \mathbf{R}\!\left(\Phi(z,t,y),\, \frac{\partial}{\partial y}\Phi(z,t,y),\, y\right).$$

Let $\rho$ be the root of $\Delta(z,1)$, so that $y(z) := F(z,1)$ is singular at $z = \rho$ and $y(\rho) = \pi$. Let
$\rho(t)$ be the unique root of the equation

$$\Delta(\rho(t),t) = 0,$$

analytic at 1, such that $\rho(1) = \rho$. Then $F(z,t)$ has the singular expansion

$$F(z,t) = \pi(t) + \lambda(t)\,(\rho(t) - z)^{1/2}\,(1 + o(1)),$$

where $\pi(t)$ and $\lambda(t)$ are analytic at 1 such that $\pi(1) = \pi$ and $\lambda(1) \ne 0$. Furthermore

$$[z^n]F(z,t) = c(t)\, n^{-3/2}\, \rho(t)^{-n} \left(1 + O\!\left(\tfrac{1}{n}\right)\right),$$

uniformly for t restricted to a small neighborhood of 1, where $c(t)$ is continuous and
nonzero near 1.

The following proof is derived from Proposition IX.17 and Theorem IX.12 in [17].

Proof. By Theorem 9, the function $y(z) = F(z,1)$ has a square-root singularity at $z = \rho$
and admits a singular expansion of the form

$$F(z,1) = \pi + \lambda\,(\rho - z)^{1/2} + O(\rho - z), \quad \text{for some nonzero constant } \lambda.$$

Singularity analysis then implies the estimate

$$[z^n]F(z,1) = c\, n^{-3/2}\, \rho^{-n} \left(1 + O\!\left(\tfrac{1}{n}\right)\right).$$

All that is needed now is a uniform lifting of the relations above, for t in a small neighborhood
of 1.

First, the polynomial $\Phi(\rho,1,y)$ has a double (not triple) zero at $y = \pi$, so that

$$\frac{\partial}{\partial y}\Phi(\rho,1,y)\Big|_{y=\pi} = 0, \qquad \frac{\partial^2}{\partial y^2}\Phi(\rho,1,y)\Big|_{y=\pi} \ne 0.$$

Thus, the Weierstrass Preparation Theorem gives the local factorization at $(\rho,1,\pi)$

$$\Phi(z,t,y) = \left(y^2 + c_1(z,t)\,y + c_2(z,t)\right) A(z,t,y),$$

where $A(z,t,y)$ is analytic and non-zero at $(\rho,1,\pi)$ while $c_1(z,t)$, $c_2(z,t)$ are analytic at
$(z,t) = (\rho,1)$.

From the solution of the quadratic equation, we must have locally

$$y = \frac{1}{2}\left(-c_1(z,t) \pm \sqrt{c_1(z,t)^2 - 4c_2(z,t)}\right).$$

Consider first $(z,t)$ restricted by $0 \le z < \rho$ and $0 \le t < 1$. Since $F(z,t)$ is real there,
we must have $c_1(z,t)^2 - 4c_2(z,t)$ also real and non-negative. Since $F(z,t)$ is continuous
and increasing with z for fixed t, and since the discriminant $c_1(z,t)^2 - 4c_2(z,t)$ vanishes
at $(\rho,1)$, the minus sign has to be constantly taken. In summary, we have

$$F(z,t) = \frac{1}{2}\left(-c_1(z,t) - \sqrt{c_1(z,t)^2 - 4c_2(z,t)}\right).$$

Set $C(z,t) := c_1(z,t)^2 - 4c_2(z,t)$. The function $C(z,1)$ has a simple real zero at $z = \rho$.
Thus, by the analytic implicit function theorem, there exists for t sufficiently close to 1
a unique simple root $\rho(t)$ of the equation $C(\rho(t),t) = 0$, which is an analytic function
of t such that $\rho(1) = \rho$. Set $\widehat{C}(z,t) := C(z\rho(t),t)$, where $C(z,t)$ is analytic at $(\rho,1)$.
Consequently $\widehat{C}(z,t)$ is analytic at (1,1), since it is a composition of two analytic functions.
When t is sufficiently close to 1, $\widehat{C}(1,t) = 0$, since $C(\rho(t),t) = 0$. Taking the singular
expansion of $\widehat{C}(z,t)^{1/2}$ at $z = 1$, we obtain

$$\widehat{C}(z,t)^{1/2} = (1 - z)^{1/2} \sum_{n \ge 0} \widehat{C}_n(t)\,(1 - z)^n,$$

where $\widehat{C}_n(t)$ is analytic around 1. For $t \to 1$, the singular expansion of $C(z,t)^{1/2}$ at $z = \rho(t)$
is given by

$$C(z,t)^{1/2} = (\rho(t) - z)^{1/2} \sum_{n \ge 0} C_n(t)\,(\rho(t) - z)^n,$$

where $C_n(t)$ is analytic around 1.

Then, since $c_1(z,t)$ and $c_2(z,t)$ are analytic, $F(z,t)$ has the singular expansion

$$F(z,t) = \pi(t) + \lambda(t)\,(\rho(t) - z)^{1/2}\,(1 + o(1)),$$

where $\pi(t)$ and $\lambda(t)$ are analytic at 1 such that $\pi(1) = \pi$ and $\lambda(1) \ne 0$. Therefore transfer
theorems and the uniformity property of singularity analysis [17] imply that

$$[z^n]F(z,t) = c(t)\, n^{-3/2}\, \rho(t)^{-n} \left(1 + O\!\left(\tfrac{1}{n}\right)\right),$$

uniformly for t restricted to a small neighborhood of 1, where $c(t)$ is continuous and
nonzero near 1. □

Combining Theorem 2 and Theorem 3, we derive

Theorem 4. Let $1 \le \gamma \le 8$ and let $\Delta_\gamma(z,s)$ be the resultant of $P_\gamma(z,e^s,X)$ and
$\frac{\partial}{\partial X}P_\gamma(z,e^s,X)$ as polynomials in X. Let $\rho_\gamma(s)$ be the unique root of the equation
$\Delta_\gamma(\rho_\gamma(s),s) = 0$, analytic at 0. Then

(a) $\rho_\gamma(s)$ is the dominant singularity of $\mathbf{H}_\gamma(z,e^s)$,

(b) $\mathbf{H}_\gamma(z,e^s)$ has the expansion

$$\mathbf{H}_\gamma(z,e^s) = \pi_\gamma(s) + \lambda_\gamma(s)\,(\rho_\gamma(s) - z)^{1/2}\,(1 + o(1)),$$

where $\pi_\gamma(s)$ and $\lambda_\gamma(s)$ are analytic at 0 such that $\pi_\gamma(0) = \pi$ and $\lambda_\gamma(0) \ne 0$,

(c) the coefficients of $\mathbf{H}_\gamma(z,e^s)$ are asymptotically given by

$$[z^n]\mathbf{H}_\gamma(z,e^s) = c_\gamma(s)\, n^{-3/2} \left(\frac{1}{\rho_\gamma(s)}\right)^{n} \left(1 + O\!\left(\tfrac{1}{n}\right)\right), \tag{3.6}$$

uniformly for s restricted to a small neighborhood of 0, where $c_\gamma(s)$ is continuous and
nonzero near 0.

Proof. By Theorem 2, $\mathbf{H}_\gamma(z,e^s)$ satisfies the algebraic equation $P_\gamma(z,e^s,X) = 0$.
Theorem 10 implies that for $1 \le \gamma \le 8$, $\mathbf{H}_\gamma(z,1)$ satisfies the conditions required by Theorem 9.
This puts us in position to apply Theorem 3, from which the theorem then follows. □



4. $\gamma$-structures

4.1. Combinatorics of $\gamma$-structures.

Lemma 1. For any $\gamma \ge 1$, we have

$$\mathbf{S}_\gamma(z,t,e) = \frac{1+z}{1 + 2z - ze}\; \mathbf{H}_\gamma\!\left(\frac{z(1+z)}{(1 + 2z - ze)^2},\, t\right). \tag{4.1}$$

Proof. Note that collapsing the stacks and adding or deleting 1-arcs do not change the genus.
Therefore, we can extend the functional equation of Lemma 3 in [18] to the bivariate case
with parameter t marking the genus. □

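Using symbolic methods we can conclude from Lemma 1:

Lemma 2. Let $\lambda$ be a fixed $\gamma$-shape of genus g with $s \ge 1$ arcs and $m \ge 0$ 1-arcs. Then
the generating function of $\tau$-canonical $\gamma$-diagrams containing no 1-arc that have shape $\lambda$
is given by

$$\mathbf{G}^{\lambda}_{\tau,\gamma}(z,t) = (1-z)^{-1} \left(\frac{z^{2\tau}}{(1-z^2)(1-z)^2 - (2z - z^2)\,z^{2\tau}}\right)^{s} z^m\, t^g.$$

In particular, $\mathbf{G}^{\lambda}_{\tau,\gamma}(z,t)$ depends only upon the genus, the number of arcs and 1-arcs in $\lambda$.
The generating function of $\tau$-canonical $\gamma$-structures now follows: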

Theorem 5. Suppose $\gamma, \tau \ge 1$ and let $u_\tau(z) = \frac{(z^2)^{\tau-1}}{z^{2\tau} - z^2 + 1}$. Then the generating function
$\mathbf{G}_{\tau,\gamma}(z,t)$ is algebraic and given by

$$\mathbf{G}_{\tau,\gamma}(z,t) = \frac{1}{u_\tau(z)z^2 - z + 1}\; \mathbf{H}_\gamma\!\left(\frac{u_\tau(z)\,z^2}{(u_\tau(z)z^2 - z + 1)^2},\, t\right). \tag{4.2}$$

As for the proof of Theorem 5, Proposition 1 guarantees that the topological genus is
solely generated by the crossing pattern of the components and not affected by inflation
or the adding of vertices. It is therefore straightforward to extend the functional equation
established in Theorem 3 in [18] to the bivariate case.



4.2. The genus distribution of $\gamma$-structures. In this section we study the random
variable $X_{n,\tau,\gamma}$ having the distribution

$$P(X_{n,\tau,\gamma} = g) = \frac{\mathbf{G}_{\tau,\gamma}(g,n)}{\mathbf{G}_{\tau,\gamma}(n)},$$

where g = 0,1, . . . , |_f J • We shall prove the following 
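Theorem 6. There exists a pair $(\mu_{\tau,\gamma}, \sigma_{\tau,\gamma})$ such that the normalized random variable

$$X^*_{n,\tau,\gamma} = \frac{X_{n,\tau,\gamma} - \mu_{\tau,\gamma}\, n}{\sqrt{n\, \sigma^2_{\tau,\gamma}}}$$

converges in distribution to a Gaussian variable with a speed of convergence $O(n^{-1/2})$. That
is, we have

$$\lim_{n \to \infty} P\!\left(\frac{X_{n,\tau,\gamma} - \mu_{\tau,\gamma}\, n}{\sqrt{n\, \sigma^2_{\tau,\gamma}}} < x\right) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{1}{2}c^2}\, dc,$$

where $\mu_{\tau,\gamma}$ and $\sigma^2_{\tau,\gamma}$ are given by

$$\mu_{\tau,\gamma} = -\frac{\theta'_{\tau,\gamma}(0)}{\theta_{\tau,\gamma}(0)}, \qquad
\sigma^2_{\tau,\gamma} = \left(\frac{\theta'_{\tau,\gamma}(0)}{\theta_{\tau,\gamma}(0)}\right)^2 - \frac{\theta''_{\tau,\gamma}(0)}{\theta_{\tau,\gamma}(0)}, \tag{4.3}$$

with $\theta_{\tau,\gamma}(s)$ the dominant singularity of $\mathbf{G}_{\tau,\gamma}(z,e^s)$, see Theorem 8.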

In Table 1 we present the values of the pairs $(\mu_{\tau,\gamma}, \sigma^2_{\tau,\gamma})$.

TABLE 1. Genus distribution: The central limit theorem for the genus in
$\gamma$-structures. We list $\mu_{\tau,\gamma}$ and $\sigma^2_{\tau,\gamma}$ derived from eq. (4.3).

              τ = 1                   τ = 2                   τ = 3
          μ          σ²           μ          σ²           μ          σ²
γ = 1     0.091240   0.021067     0.041235   0.009358     0.026632   0.006043
γ = 2     0.112037   0.022088     0.050436   0.009768     0.032564   0.006288

              τ = 4                   τ = 5                   τ = 6
          μ          σ²           μ          σ²           μ          σ²
γ = 1     0.019706   0.004481     0.015666   0.003571     0.013017   0.002974
γ = 2     0.024104   0.004657     0.019170   0.003709     0.015935   0.003087
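
To read the table in terms of concrete sequence lengths, the limiting Gaussian for length n has mean $\mu_{\tau,\gamma}\, n$ and standard deviation $\sqrt{\sigma^2_{\tau,\gamma}\, n}$. The following sketch copies the values of Table 1 into a dictionary keyed by `(tau, gamma)`; the names `TABLE_1` and `genus_gaussian` are illustrative.

```python
import math

# (mu, sigma^2) per (tau, gamma), copied from Table 1.
TABLE_1 = {
    (1, 1): (0.091240, 0.021067), (1, 2): (0.112037, 0.022088),
    (2, 1): (0.041235, 0.009358), (2, 2): (0.050436, 0.009768),
    (3, 1): (0.026632, 0.006043), (3, 2): (0.032564, 0.006288),
    (4, 1): (0.019706, 0.004481), (4, 2): (0.024104, 0.004657),
    (5, 1): (0.015666, 0.003571), (5, 2): (0.019170, 0.003709),
    (6, 1): (0.013017, 0.002974), (6, 2): (0.015935, 0.003087),
}


def genus_gaussian(tau, gamma, n):
    """Mean and standard deviation of the limiting genus distribution at length n."""
    mu, sigma2 = TABLE_1[(tau, gamma)]
    return mu * n, math.sqrt(sigma2 * n)


for tau in (2, 3, 4):
    mean, sd = genus_gaussian(tau, 1, 100)
    print(f"tau={tau}: mean genus {mean:.2f}, standard deviation {sd:.2f}")
```

For canonical 1-structures ($\tau = 2$, $\gamma = 1$) of length 100 this gives a mean genus of about 4.12, in line with the expected genus of 4 quoted in the Introduction.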



Theorem 6 follows from Theorem 1 setting

$$f(z,e^s) = \mathbf{G}_{\tau,\gamma}(z,e^s),$$

and we shall subsequently verify the applicability of the latter.

The crucial prerequisite for applying Theorem 1 is accomplished by Theorem 8, which
in turn is implied by Theorem 7 below, which guarantees

$$[z^n]\mathbf{H}_\gamma(\psi(z),e^s) = A(s)\, n^{-3/2} \left(\frac{1}{\kappa(s)}\right)^{n} \left(1 + O\!\left(\tfrac{1}{n}\right)\right), \quad A(s) \text{ continuous}, \tag{4.4}$$

for $1 \le \gamma \le 7$. Once eq. (4.4) is established, the analyticity of $\kappa(s)$ is guaranteed by the
analytic implicit function theorem [17].

Note that Theorem 4 already guarantees that the coefficients of $\mathbf{H}_\gamma(z,e^s)$ are
asymptotically given by

$$[z^n]\mathbf{H}_\gamma(z,e^s) = c_\gamma(s)\, n^{-3/2} \left(\frac{1}{\rho_\gamma(s)}\right)^{n} \left(1 + O\!\left(\tfrac{1}{n}\right)\right), \tag{4.5}$$

uniformly for s restricted to a small neighborhood of 0, where $c_\gamma(s)$ is continuous and
nonzero near 0. However, according to Theorem 5 we have

$$\mathbf{G}_{\tau,\gamma}(z,t) = \frac{1}{u_\tau(z)z^2 - z + 1}\; \mathbf{H}_\gamma\!\left(\frac{u_\tau(z)\,z^2}{(u_\tau(z)z^2 - z + 1)^2},\, t\right).$$

Consequently we have to establish uniform convergence for generating functions of the
form $\mathbf{H}_\gamma(\psi(z),e^s)$, where $\psi(z)$ is analytic for $|z| < r$.

Theorem 7. Suppose $1 \le \gamma \le 7$. Let $\psi(z)$ be an analytic function for $|z| < r$, such that
$\psi(0) = 0$. In addition suppose $\kappa(s)$ is the unique dominant singularity of $\mathbf{H}_\gamma(\psi(z),e^s)$
and analytic solution of $\psi(\kappa(s)) = \rho_\gamma(s)$, $|\kappa(s)| < r$, $\frac{d}{dz}\psi(\kappa(s)) \ne 0$ for $|s| < \epsilon$. Then
$\mathbf{H}_\gamma(\psi(z),e^s)$ has a singular expansion and

$$[z^n]\mathbf{H}_\gamma(\psi(z),e^s) = A(s)\, n^{-3/2} \left(\frac{1}{\kappa(s)}\right)^{n} \left(1 + O\!\left(\tfrac{1}{n}\right)\right) \quad \text{for some continuous } A(s) \in \mathbb{C},$$

uniformly in s contained in a small neighborhood of 0.

We prove Theorem 7 in Section 6.

We proceed by applying Theorem 7 in order to derive an asymptotic formula for the
coefficients of $\mathbf{G}_{\tau,\gamma}(z,e^s)$ viewed as a univariate generating function, parameterized by $e^s$.
The key point here is that this formula is uniform in the parameter s, close to 0.

Theorem 8. For $1 \le \gamma \le 8$ and $1 \le \tau \le 10$, $\mathbf{G}_{\tau,\gamma}(z,e^s)$ has a unique dominant
singularity, $\theta_{\tau,\gamma}(s)$, such that for s restricted to a small neighborhood of 0:

(1) $\theta_{\tau,\gamma}(s)$ is analytic,

(2) $\theta_{\tau,\gamma}(s)$ is the solution of minimal modulus of

$$\frac{u_\tau(z)\,z^2}{(u_\tau(z)z^2 - z + 1)^2} = \rho_\gamma(s),$$

(3)

$$[z^n]\mathbf{G}_{\tau,\gamma}(z,e^s) = k_{\tau,\gamma}(s)\, n^{-3/2} \left(\frac{1}{\theta_{\tau,\gamma}(s)}\right)^{n} \left(1 + O\!\left(\tfrac{1}{n}\right)\right),$$

uniformly for s restricted to a small neighborhood of 0, where $k_{\tau,\gamma}(s)$ is continuous and
nonzero near 0.

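Proof. The first step is to establish the existence and uniqueness of the dominant
singularity $\theta_{\tau,\gamma}(s)$.
We denote

$$\vartheta_\tau(z) = u_\tau(z)z^2 - z + 1, \tag{4.6}$$

$$\psi_\tau(z) = \frac{u_\tau(z)\,z^2}{(u_\tau(z)z^2 - z + 1)^2}, \tag{4.7}$$

and consider the equations

$$\forall\, 1 \le \gamma \le 8; \quad F_{\tau,\gamma}(z,s) = \psi_\tau(z) - \rho_\gamma(s),$$

where $\rho_\gamma(s)$ is defined in Theorem 4. Theorem 5 and Theorem 7 imply that the
singularities of $\mathbf{G}_{\tau,\gamma}(z,e^s)$ are contained in the set of roots of

$$F_{\tau,\gamma}(z,s) = 0 \quad \text{and} \quad \vartheta_\tau(z) = 0,$$

where $1 \le \gamma \le 8$. Let $\theta_{\tau,\gamma}$ denote the solution of minimal modulus of

$$F_{\tau,\gamma}(z,0) = \psi_\tau(z) - \rho_\gamma(0) = 0.$$

We next verify that, for sufficiently small $\epsilon_\gamma > 0$, $|z - \theta_{\tau,\gamma}| < \epsilon_\gamma$, $|s| < \epsilon_\gamma$, the following
assertions hold:

• $\frac{\partial}{\partial z}F_{\tau,\gamma}(\theta_{\tau,\gamma},0) \ne 0$,

• $\frac{\partial}{\partial z}F_{\tau,\gamma}(z,s)$ and $\frac{\partial}{\partial s}F_{\tau,\gamma}(z,s)$ are continuous.

The analytic implicit function theorem guarantees the existence of a unique analytic
function $\theta_{\tau,\gamma}(s)$ such that, for $|s| < \epsilon_\gamma$,

$$F_{\tau,\gamma}(\theta_{\tau,\gamma}(s),s) = 0 \quad \text{and} \quad \theta_{\tau,\gamma}(0) = \theta_{\tau,\gamma}.$$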

Analogously, we obtain the minimal solution 5 T of i? T (z) = 0. We next verify that 
the unique dominant singularity of G Tj7 (2;, 1) is the minimal positive solution T)7 of 
F T)7 (z,0) = and subsequently using an continuity argument. Therefore, for sufficiently 
small e where e < ej, |s| < e, the modulus of Tj7 (s), for 1 < 7 < 8 and <5 r are all strictly 
larger than the modulus of 8 T)1 (s). Consequently, 6* Tj7 (s) is the unique dominant singu- 
larity of G Ti7 (z, e s ). 

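Claim. There exists some continuous $k_{\tau,\gamma}(s)$ such that, uniformly in s, for s in a
neighborhood of 0,

$$[z^n]\mathbf{G}_{\tau,\gamma}(z,e^s) = k_{\tau,\gamma}(s)\, n^{-3/2} \left(\frac{1}{\theta_{\tau,\gamma}(s)}\right)^{n} \left(1 + O\!\left(\tfrac{1}{n}\right)\right).$$

To prove the Claim, let r be some positive real number such that $\theta_{\tau,\gamma} < r < \delta_\tau$. For
sufficiently small $\epsilon > 0$ and $|s| < \epsilon$,

$$|\theta_{\tau,\gamma}(s)| < r.$$

Then $\psi_\tau(z)$ and $\frac{1}{\vartheta_\tau(z)}$ are analytic in $|z| < r$ and $\psi_\tau(0) = 0$. Moreover, $\theta_{\tau,\gamma}(s)$ is the
unique dominant singularity of

$$\mathbf{G}_{\tau,\gamma}(z,e^s) = \frac{1}{\vartheta_\tau(z)}\, \mathbf{H}_\gamma(\psi_\tau(z),e^s),$$

and it satisfies

$$\psi_\tau(\theta_{\tau,\gamma}(s)) = \rho_\gamma(s) \quad \text{and} \quad |\theta_{\tau,\gamma}(s)| < r$$

for $|s| < \epsilon$. For sufficiently small $\epsilon > 0$, $\frac{\partial}{\partial z}F_{\tau,\gamma}(z,s)$ is continuous and
$\frac{\partial}{\partial z}F_{\tau,\gamma}(\theta_{\tau,\gamma},0) \ne 0$. Thus there exists some $\epsilon > 0$, such that for $|s| < \epsilon$,
$\frac{\partial}{\partial z}F_{\tau,\gamma}(\theta_{\tau,\gamma}(s),s) \ne 0$. According to Theorem 7 we therefore derive

$$[z^n]\mathbf{G}_{\tau,\gamma}(z,e^s) = k_{\tau,\gamma}(s)\, n^{-3/2} \left(\frac{1}{\theta_{\tau,\gamma}(s)}\right)^{n} \left(1 + O\!\left(\tfrac{1}{n}\right)\right),$$

uniformly for s restricted to a small neighborhood of 0, where $k_{\tau,\gamma}(s)$ is continuous and
nonzero near 0. □

Fig. 10. The effect of minimum stack-size: the shift of the central limit distribution
of topological genera of 2-canonical, 3-canonical and 4-canonical 1-structures
for n = 100.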



5. Discussion 

Our results trigger a series of intriguing research perspectives. The genus filtration,
and in particular the emergence of the bivariate polynomials $\mathbf{Is}_g(z,t)$, gives rise to the
question whether we can find bijective, constructive proofs in order to establish recurrences
with respect to the topological genus g. It would be fascinating to be able to construct
genus $(\gamma+1)$-structures from lower genera. Such genus recurrences could have profound
algorithmic impact and be of great practical value.

It is furthermore now clear how to introduce a genus filtration into 7-interaction struc- 
tures [12]. A result of [H] indicates how this can be derived. There it is proved how to 
compute the topological genus of a γ-interaction structure. In contrast to the case of a
single backbone, this genus is not simply the sum of the genera of its irreducible
shadows. This computation can be woven into the combinatorial construction presented
here in order to refine our results by the topological genus.

Our analysis of the genus distribution in γ-structures shows how the minimum stack size
in these structures affects the expected genus. For canonical 1-structures of length
100 the expected genus is 4; it drops to 2.6 when requiring a minimum
stack-size of 3, and to 2.0 for a minimum stack-size of 4, see Fig. 10. Natural RNA
structures are typically low energy structures, and energy is dominated
by the stacking of adjacent base pairs and not by the hydrogen bonds of the individual
base pairs [3]. In view of this, and of minimum arc-length conditions [1], our results provide some
insight into why relatively low genera are observed in natural RNA structures.
6. Appendix

6.1. Asymptotic analysis of γ-matchings. In this section we establish several results
employed in the course of Section 3. In [18], the singular expansion of $H_\gamma(z)$ has been
computed using the method of Newton polygons and the Newton-Puiseux expansion has
been derived.

The following result of JXT] is based on the same arguments but obsoletes the irre- 
ducibility of the polynomials representing the algebraic equations. As a result, we can 
compute singular expansions for higher 7. 

Theorem 9. Let $y(z) = \sum_{n \ge 0} y_n z^n$ be a generating function, analytic at 0, that satisfies a
polynomial equation $\Phi(z,y) = 0$. Let $\rho$ be the real dominant singularity of $y(z)$. Define
the resultant of $\Phi(z,y)$ and $\frac{\partial}{\partial y}\Phi(z,y)$ as polynomials in $y$,

$\Delta(z) = R\left(\Phi(z,y), \frac{\partial}{\partial y}\Phi(z,y), y\right)$.

(1) The dominant singularity $\rho$ is unique and a root of the resultant $\Delta(z)$, and there exists
$\pi = y(\rho)$ satisfying the system of equations

(6.1) $\Phi(\rho,\pi) = 0, \quad \Phi_y(\rho,\pi) = 0$.

(2) If $\Phi(z,y)$ satisfies the conditions

(6.2) $\Phi_z(\rho,\pi) \neq 0, \quad \Phi_{yy}(\rho,\pi) \neq 0$,

then $y(z)$ has the following expansion at $\rho$

(6.3) $y(z) = \pi + \lambda (\rho - z)^{1/2} + O(\rho - z)$, for some nonzero constant $\lambda$.

Further, the coefficients of $y(z)$ satisfy

$[z^n] y(z) \sim c\, n^{-3/2} \rho^{-n}, \quad n \to \infty$,

for some constant $c > 0$.

Proof. The proof of (1) can be found in [17] or [20] pp. 103. To prove (2), let $\Psi(z,y) =
\Phi(\rho - z, \pi - y)$. Immediately, we have $\Psi(0,0) = 0$. Puiseux's Theorem [21] guarantees
a solution of $y - \pi$ in terms of a power series in fractional powers of $\rho - z$. Note that
equations (6.1) and (6.2) are equivalent to

$\Psi(0,0) = 0, \quad \Psi_y(0,0) = 0, \quad \Psi_z(0,0) \neq 0, \quad \Psi_{yy}(0,0) \neq 0$.

Then we apply Newton's polygon method to determine the type of expansion and find the
first exponent of $z$ to be $1/2$. Therefore $y(z)$ has the required form of singular expansion.
The asymptotics of the coefficients follows from eq. (6.3) as a straightforward application
of the transfer theorem ([17], pp. 389, Theorem VI.3). □
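The conditions of Theorem 9 can be checked by hand on a small example. The following is a minimal sketch, not taken from the paper: it assumes the Catalan generating function $y = 1 + z y^2$, i.e. $\Phi(z,y) = z y^2 - y + 1$, as a stand-in for the polynomials treated here. For this choice $\rho = 1/4$, $\pi = 2$, and the constant is $c = 1/\sqrt{\pi}$ (with $\pi \approx 3.14159$ in this last expression). All function names are hypothetical.

```python
from math import pi, sqrt

# Illustrative assumption: Phi(z, y) = z*y^2 - y + 1 (Catalan numbers),
# chosen only because everything can be verified by hand.
RHO = 0.25       # dominant singularity rho
PI_VALUE = 2.0   # pi = y(rho) in the notation of Theorem 9

def phi(z, y):
    return z * y * y - y + 1

def phi_y(z, y):
    return 2 * z * y - 1

def phi_z(z, y):
    return y * y

def phi_yy(z, y):
    return 2 * z

def resultant(z):
    # Resultant of Phi and Phi_y with respect to y, computed by hand from
    # the 3x3 Sylvester determinant: a*(4*a*c - b^2) with a=z, b=-1, c=1.
    return z * (4 * z - 1)

def scaled_catalan(n):
    # a_n = C_n / 4^n, via C_{k+1} = C_k * 2(2k+1)/(k+2); stays in float range.
    a = 1.0
    for k in range(n):
        a *= (2 * k + 1) / (2 * (k + 2))
    return a

def asymptotic_ratio(n):
    # Ratio of [z^n] y(z) to c * n^(-3/2) * rho^(-n) with c = 1/sqrt(pi).
    return scaled_catalan(n) * sqrt(pi) * n ** 1.5
```

In this sketch `phi` and `phi_y` vanish at `(RHO, PI_VALUE)` while `phi_z` and `phi_yy` do not, matching (6.1) and (6.2), and `asymptotic_ratio(n)` approaches 1 as `n` grows, in line with the $n^{-3/2}\rho^{-n}$ asymptotics.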

Combining Theorem 1 in [18] and Theorem 9, the asymptotic analysis of $H_\gamma(z)$ follows,
generalizing the results in 
Theorem 10. For $1 \le \gamma \le 8$, let

$\Delta_\gamma(z) = R\left(P_\gamma(z,X), \frac{\partial}{\partial X} P_\gamma(z,X), X\right)$

be the resultant of $P_\gamma(z,X)$ and $\frac{\partial}{\partial X} P_\gamma(z,X)$ as polynomials in $X$, and let $\rho_\gamma$ denote the real
dominant singularity of $H_\gamma(z)$. Then

(a) the dominant singularity $\rho_\gamma$ is unique and a root of $\Delta_\gamma(z)$,

(b) at $\rho_\gamma$ we have

$H_\gamma(z) = \pi_\gamma + \lambda_\gamma (\rho_\gamma - z)^{1/2} + O(\rho_\gamma - z)$, for some nonzero constant $\lambda_\gamma$,

(c) the coefficients of $H_\gamma(z)$ are asymptotically given by

$[z^n] H_\gamma(z) \sim c_\gamma\, n^{-3/2} \rho_\gamma^{-n}$

for some $c_\gamma > 0$.

Proof. Pringsheim's Theorem ([17] pp. 240) guarantees that for any $\gamma$, $H_\gamma(z)$ has a dominant
real singularity $\rho_\gamma > 0$. To prove the singular expansion of the function and the
asymptotics of the coefficients, we verify that $P_\gamma(z,X)$ satisfies the conditions of Theorem 9, and the
results follow. □

6.2. Proof of Theorem 3.

Proof. We consider the composite function $H_\gamma(\psi(z), e^s)$. In view of

$[z^n] f(z,s) = \gamma^n [z^n] f\left(\frac{z}{\gamma}, s\right)$,

it suffices to analyze the function $H_\gamma(\psi(\kappa(s) z), e^s)$ and to subsequently rescale in order
to obtain the correct exponential factor. For this purpose we set

$\tilde\psi(z,s) = \psi(\kappa(s) z)$,

where $\psi(z)$ is analytic in $|z| < r$. Consequently $\tilde\psi(z,s)$ is analytic in $|z| < \tilde r$ and $|s| < \tilde\epsilon$,
for some $1 < \tilde r$, $0 < \tilde\epsilon < \epsilon$, since it is a composition of two analytic functions. Taking its
Taylor expansion at $z = 1$,

(6.4) $\tilde\psi(z,s) = \sum_{n \ge 0} \tilde\psi_n(s) (1 - z)^n$,

where $\tilde\psi_n(s)$ is analytic in $|s| < \tilde\epsilon$. The singular expansion of $H_\gamma(z, e^s)$, $1 \le \gamma \le 8$, for
z — > p 7 (s), follows from Theorem HJ and is given by 

$H_\gamma(z, e^s) = \pi_\gamma(s) + \lambda_\gamma(s) \left(\rho_\gamma(s) - z\right)^{1/2} (1 + o(1))$.

By assumption, $\kappa(s)$ is the unique analytic solution of $\psi(\kappa(s)) = \rho_\gamma(s)$, for $|\kappa(s)| < r$,
and by construction $H_\gamma(\psi(\kappa(s) z), e^s) = H_\gamma(\tilde\psi(z,s), e^s)$. In view of eq. (6.4), we have for
$z \to 1$ the expansion

(6.5) $\tilde\psi(z,s) - \rho_\gamma(s) = \sum_{n \ge 1} \tilde\psi_n(s) (1 - z)^n = \tilde\psi_1(s) (1 - z) (1 + o(1))$,

that is uniform in $s$ since $\tilde\psi_n(s)$ is analytic for $|s| < \tilde\epsilon$ and $\tilde\psi_0(s) = \psi(\kappa(s)) = \rho_\gamma(s)$.
As for the singular expansion of $H_\gamma(\tilde\psi(z,s), e^s)$ we derive, substituting eq. (6.5)
into the singular expansion of $H_\gamma(z, e^s)$, for $z \to 1$,

$\pi_\gamma(s) + c_\gamma(s) (1 - z)^{1/2} (1 + o(1))$,

where $c_\gamma(s) = \lambda_\gamma(s) \left(-\tilde\psi_1(s)\right)^{1/2}$ and

$\tilde\psi_1(s) = -\partial_z \tilde\psi(z,s)\big|_{z=1} = -\kappa(s)\, \frac{d}{dz}\psi(\kappa(s)) \neq 0$ for $|s| < \tilde\epsilon$.

Furthermore $\pi_\gamma(s)$ is analytic at $|z| \le 1$, whence $[z^n] \pi_\gamma(s)$ is exponentially small compared
to 1. Therefore we arrive at

(6.6) $[z^n] H_\gamma(\tilde\psi(z,s), e^s) = [z^n]\, c_\gamma(s) (1 - z)^{1/2} (1 + o(1))$

uniformly in $|s| < \tilde\epsilon$. We observe that $c_\gamma(s)$ is analytic in $|s| < \tilde\epsilon$. Note that the dependency
on the parameter $s$ enters only through the coefficients $c_\gamma(s)$, which are analytic in $s$. Standard
transfer theorems [17] imply that

$[z^n] H_\gamma(\tilde\psi(z,s), e^s) = A(s)\, n^{-3/2} \left(1 + O(1/n)\right)$ for some $A(s) \in \mathbb{C}$,

uniformly in $s$ contained in a small neighborhood of 0. Finally, as mentioned in the beginning
of the proof, we use the scaling property of Taylor expansions in order to derive

$[z^n] H_\gamma(\psi(z), e^s) = \left(\kappa(s)\right)^{-n} [z^n] H_\gamma(\tilde\psi(z,s), e^s)$

and the proof of the Theorem is complete. □
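The two ingredients of this proof, the transfer of a square-root singularity at $z = 1$ and the subsequent rescaling by $(\kappa(s))^{-n}$, can be illustrated on a toy function. The sketch below is an assumption-laden example, not the paper's $H_\gamma$: it takes $g(z) = (1 - 4z)^{1/2}$ with a fixed $\kappa = 1/4$ (no dependence on $s$), so that $g(\kappa z) = (1 - z)^{1/2}$. It uses the standard facts that $[z^n](1-z)^{1/2} \sim n^{-3/2}/\Gamma(-1/2)$ and that $[z^n](1-4z)^{1/2} = -2 C_{n-1}$ for $n \ge 1$, where $C_m$ is the $m$-th Catalan number. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, gamma

def sqrt_coeffs(n_max):
    """Exact coefficients b_n = [z^n] (1 - z)^(1/2) for n = 0..n_max."""
    coeffs = [Fraction(1)]
    for n in range(n_max):
        # b_{n+1} = b_n * (n - 1/2) / (n + 1)
        coeffs.append(coeffs[-1] * Fraction(2 * n - 1, 2 * (n + 1)))
    return coeffs

def transfer_ratio(n):
    """b_n divided by the transfer-theorem prediction n^(-3/2) / Gamma(-1/2)."""
    b_n = sqrt_coeffs(n)[n]
    return float(b_n) / (n ** -1.5 / gamma(-0.5))

def rescaled_coeff(n, kappa):
    """[z^n] g(z) recovered as kappa^(-n) * [z^n] g(kappa * z)."""
    return kappa ** (-n) * sqrt_coeffs(n)[n]

def catalan(m):
    return comb(2 * m, m) // (m + 1)
```

For example, `rescaled_coeff(3, Fraction(1, 4))` reproduces $-2 C_2$, the coefficient of $z^3$ in $(1 - 4z)^{1/2}$, and `transfer_ratio(n)` tends to 1 as `n` grows. Exact rational arithmetic is used for the rescaling check so that the identity holds without rounding.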